Precise Information Control in Long-Form Text Generation
He, Jacqueline, Yen, Howard, Li, Margaret, Li, Shuyue Stella, Zeng, Zhiyuan, Shi, Weijia, Tsvetkov, Yulia, Chen, Danqi, Koh, Pang Wei, Zettlemoyer, Luke
A central challenge in language models (LMs) is faithfulness hallucination: the generation of information unsubstantiated by input context. To study this problem, we propose Precise Information Control (PIC), a new task formulation that requires models to generate long-form outputs grounded in a provided set of short self-contained statements, without adding any unsupported ones. PIC includes a full setting that tests a model's ability to include exactly all input claims, and a partial setting that requires the model to selectively incorporate only relevant claims. We present PIC-Bench, a benchmark of eight long-form generation tasks (e.g., summarization, biography generation) adapted to the PIC setting, where LMs are supplied with well-formed, verifiable input claims. Our evaluation of a range of open and proprietary LMs on PIC-Bench reveals that, surprisingly, state-of-the-art LMs still hallucinate against user-provided input in over 70% of generations. To alleviate this lack of faithfulness, we introduce a post-training framework that uses a weakly supervised preference data construction method to train an 8B PIC-LM with stronger PIC ability--improving from 69.1% to 91.0% F1 in the full PIC setting. When integrated into end-to-end factual generation pipelines, PIC-LM improves exact match recall by 17.1% on ambiguous QA with retrieval, and factual precision by 30.5% on a birthplace fact-checking task, underscoring the potential of precisely grounded generation.
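The abstract evaluates "full PIC" with a claim-level F1: recall over the provided input claims and precision over the claims expressed in the output. The following is a minimal sketch of such a metric, not the authors' evaluation code; the `supports` judge (e.g., an NLI model or LLM verifier) and the claim lists are assumed inputs.

```python
from typing import Callable, List

def pic_f1(
    input_claims: List[str],
    output_claims: List[str],
    supports: Callable[[str, str], bool],
) -> float:
    """Claim-level F1 for the full PIC setting (illustrative sketch).

    supports(premise, claim) stands in for an entailment judge that
    decides whether `claim` is grounded in `premise`.
    """
    if not input_claims or not output_claims:
        return 0.0

    # Recall: fraction of input claims covered by the generated output.
    covered = sum(
        any(supports(o, c) for o in output_claims) for c in input_claims
    )
    recall = covered / len(input_claims)

    # Precision: fraction of generated claims grounded in the input set.
    grounded = sum(
        any(supports(i, c) for i in input_claims) for c in output_claims
    )
    precision = grounded / len(output_claims)

    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```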
Augmenting Query and Passage for Retrieval-Augmented Generation using LLMs for Open-Domain Question Answering
Kim, Minsang, Park, Cheoneum, Baek, Seungjun
Retrieval-augmented generation (RAG) has received much attention for open-domain question answering (ODQA) as a means to compensate for the parametric knowledge of large language models (LLMs). While previous approaches focus on processing retrieved passages to remove irrelevant context, they still rely heavily on the quality of the retrieved passages, which can degrade if the question is ambiguous or complex. In this paper, we propose a simple yet efficient method, question and passage augmentation via LLMs, for open-domain QA. Our method first decomposes the original question into multi-step sub-questions. By augmenting the original question with detailed sub-questions and planning, we make the query more specific about what needs to be retrieved, improving retrieval performance. In addition, to handle cases where the retrieved passages contain distracting information or divided opinions, we augment the retrieved passages with self-generated passages from the LLM to guide answer extraction. Experimental results show that the proposed scheme outperforms the previous state of the art and achieves significant performance gains over existing RAG methods.
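The pipeline the abstract describes (decompose the question, retrieve with the augmented query, add a self-generated passage, then extract the answer) can be summarized in a short sketch. This is not the paper's implementation: the `llm` and `retrieve` callables, the prompts, and the sub-question budget are hypothetical placeholders.

```python
from typing import Callable, List

def augmented_rag_answer(
    question: str,
    llm: Callable[[str], str],              # placeholder for any LLM call
    retrieve: Callable[[str], List[str]],   # placeholder for any retriever
    n_sub_questions: int = 3,
) -> str:
    """Sketch of query + passage augmentation for open-domain QA."""
    # 1) Query augmentation: decompose the question into step-wise
    #    sub-questions so retrieval targets become more specific.
    decomposition = llm(
        f"Break the question into at most {n_sub_questions} step-by-step "
        f"sub-questions, one per line.\nQuestion: {question}"
    )
    sub_questions = [q.strip() for q in decomposition.splitlines() if q.strip()]

    # 2) Retrieve with the original question and each sub-question.
    passages: List[str] = []
    for q in [question, *sub_questions]:
        passages.extend(retrieve(q))

    # 3) Passage augmentation: add a self-generated passage from the LLM's
    #    parametric knowledge to offset distracting or conflicting evidence.
    passages.append(
        llm(f"Write a short background passage answering: {question}")
    )

    # 4) Answer extraction over the augmented context.
    context = "\n\n".join(passages)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer concisely:")
```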